NTpred FRAMEWORK FOR TYROSINE NITRATION INFERENCE ON UNSEEN DATA:

Users can enter raw sequences in the text field or upload a FASTA file containing raw sequences.

Sequences entered in the text field should be comma-separated raw sequences.
Each record in an input FASTA file should contain a sequence name followed by the sequence.
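The two input formats can be read with a few lines of Python. This is a minimal sketch, not the framework's own parser: the function names `parse_text_field` and `parse_fasta` are hypothetical, and it assumes that sequences in a FASTA file may span several lines.

```python
def parse_text_field(text):
    """Split comma-separated raw sequences, dropping blanks and whitespace."""
    return [part.strip() for part in text.split(",") if part.strip()]


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # Sequence lines are joined (assumption: multi-line records allowed).
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```

For example, `parse_fasta("> sample_1\nITILSYHSS\n")` yields a single record named `sample_1`.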
The framework's predictions are saved in a CSV file that can be downloaded.
Furthermore, the framework predicts tyrosine nitration sites at multiple positions in an input sequence, one for each occurrence of Y (tyrosine) in the sequence.
- Let us assume a sample input sequence in the FASTA file:
  > sample_1
  ITILSYHSSIGVRKDELVHGYILVYSAKRKASMGMLRAFLS
- In this sample, Y occurs at locations 6, 21 and 25.
- Then, the output prediction CSV file will have 3 rows for this sequence:
  - Sequence_name, Sequence, Probability, Class
  - sample_1__6, ITILSYHSSIGVRKDELVHGYILVYS, 0.1057, 0
  - sample_1__21, ITILSYHSSIGVRKDELVHGYILVYSAKRKASMGMLRAFLS, 0.0611, 0
  - sample_1__25, SYHSSIGVRKDELVHGYILVYSAKRKASMGMLRAFLS, 0.06139, 0
- The Sequence_name column holds the name from the input FASTA file with "__location" appended, where location is the position of the Y residue in the protein sequence.
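The row naming in this example can be reproduced with a short sketch. The window size is an assumption: the sample rows are consistent with keeping up to 20 residues on each side of the Y and clipping at the sequence ends, but the text does not state this. The function name `tyrosine_windows` and the `flank` parameter are hypothetical.

```python
def tyrosine_windows(name, sequence, flank=20):
    """List (row_name, window) pairs, one per Y residue in the sequence.

    flank=20 is inferred from the sample output above, not documented.
    """
    rows = []
    for index, residue in enumerate(sequence):
        if residue != "Y":
            continue
        location = index + 1  # locations are 1-based, as in the example
        start = max(0, index - flank)
        end = min(len(sequence), index + flank + 1)
        rows.append((f"{name}__{location}", sequence[start:end]))
    return rows


rows = tyrosine_windows("sample_1", "ITILSYHSSIGVRKDELVHGYILVYSAKRKASMGMLRAFLS")
for row_name, window in rows:
    print(row_name, window)
```

Under that assumption, the three names and sequences printed for sample_1 match the Sequence_name and Sequence values listed above.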
The downloaded prediction CSV file is structured as follows:
- The CSV file contains four columns: Sequence_name, Sequence, Probability, Class
- The Sequence_name column gives the name of the sequence.
- The Sequence column gives the sequence used for the prediction.
- The Probability column gives the predicted probability that a tyrosine nitration site is present.
- The Class column translates the probability into a class label, where 1 denotes a positive tyrosine nitration site and 0 denotes a negative one.
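A downloaded prediction file can be loaded with Python's standard `csv` module. The sketch below assumes the file is comma-delimited with a header row using the four column names listed above; the function name `read_predictions` is hypothetical. It also splits Sequence_name back into the original name and the Y location.

```python
import csv
import io


def read_predictions(handle):
    """Parse prediction rows into dictionaries with typed values."""
    rows = []
    for row in csv.DictReader(handle, skipinitialspace=True):
        # Split on the last "__" so names containing "__" still work.
        name, _, location = row["Sequence_name"].rpartition("__")
        rows.append({
            "name": name,
            "location": int(location),
            "sequence": row["Sequence"],
            "probability": float(row["Probability"]),
            "site_class": int(row["Class"]),
        })
    return rows


# In-memory stand-in for a downloaded file, using one row from the example.
example = io.StringIO(
    "Sequence_name,Sequence,Probability,Class\n"
    "sample_1__6,ITILSYHSSIGVRKDELVHGYILVYS,0.1057,0\n"
)
predictions = read_predictions(example)
```

With a real file, pass `open(path, newline="")` instead of the `io.StringIO` object.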

TRAINING THE HYBRID ENSEMBLE ARCHITECTURE FROM SCRATCH:

The NTpred framework can be used to run experiments in k-fold cross-validation and independent test settings. In both settings, training data should be provided in standard FASTA format.
The FASTA record header (the text after the leading ">") should follow this pattern:
- Sequence_Name|Class|Label. For example: sample_1|0|training
- Sequence_Name should be unique.
- Class should be either 1 or 0, marking the sequence as a positive or negative site, respectively.
- Label is an arbitrary placeholder value.
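Before uploading training data, it can help to check the headers against these rules. The following is a minimal validation sketch, not part of NTpred itself; the function names `parse_training_header` and `check_unique_names` are hypothetical.

```python
def parse_training_header(header):
    """Split 'Sequence_Name|Class|Label' into (name, site_class, label)."""
    fields = [field.strip() for field in header.lstrip(">").split("|")]
    if len(fields) != 3:
        raise ValueError(f"expected 3 '|'-separated fields: {header!r}")
    name, class_text, label = fields
    if not name:
        raise ValueError(f"empty sequence name: {header!r}")
    if class_text not in ("0", "1"):
        raise ValueError(f"Class must be 0 or 1: {header!r}")
    return name, int(class_text), label


def check_unique_names(headers):
    """Return the sorted sequence names that occur more than once."""
    seen, duplicates = set(), set()
    for header in headers:
        name, _, _ = parse_training_header(header)
        if name in seen:
            duplicates.add(name)
        seen.add(name)
    return sorted(duplicates)
```

For the example header, `parse_training_header("sample_1|0|training")` returns the name, the integer class, and the placeholder label as a tuple.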
Users can choose either the "Kfold" or the "Standard" training mode.
- "Kfold" training mode performs a k-fold evaluation of the NTpred framework on the provided training data.
- "Standard" training mode supports the independent test setting. A model is trained on the user-provided training data and deployed, and the user can then use the trained model for prediction.
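For readers unfamiliar with k-fold evaluation, the sketch below shows how a data set is generally partitioned into k folds, with each fold held out once for testing. It illustrates the general technique only: how NTpred builds its folds (shuffling, stratification by class, value of k) is not described here, and the function name `kfold_indices` is hypothetical.

```python
def kfold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # The first (n_samples % k) folds get one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(kfold_indices(10, 3))
```

Every sample appears in exactly one test fold, so each training record contributes to the evaluation once. In practice the records would usually be shuffled first, which this sketch omits.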